Two Stage Neural Network model for Recognition of Indian Languages from Speech
Authors
Abstract
India is a multilingual country. About 20 languages are officially recognized by the government, and about 500 languages are spoken in different parts of the country. To develop speech systems in the Indian context, it is necessary to capture language-specific knowledge automatically from speech. This knowledge may then be exploited for different speech tasks such as language identification, speaker identification, and speech recognition. In this paper we focus on the identification of 15 Indian languages from speech using spectral features, which are represented in this work by Mel frequency cepstral coefficients. The 15 Indian languages considered for this study are: Assamese (As), Bengali (Ben), Gujarati (Guj), Hindi (Hi), Kannada (Ka), Kashmiri (Kas), Malayalam (Mal), Marathi (Mar), Nepali (Nep), Oriya (Or), Punjabi (Pun), Rajasthani (Raj), Tamil (Ta), Telugu (Te) and Urdu (Ur). Neural network models are explored for capturing the language-specific information from the spectral features. It is known that the vocal tract characteristics differ when producing the sound units of different languages [1]. Directly developing simple language models using neural networks may not give optimal performance, because most of the Indian languages originated from a single ancient root language, Sanskrit, and are therefore easily confused with one another. The degree of confusion is especially high among the languages of neighboring geographical regions.
Similar resources
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable performance gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Publication date: 2010